Modeling Word Durations

نویسنده

  • Venkata Ramana Rao
چکیده

We describe a new method of modeling duration at word level. These duration models are easily trained from the acoustic training data and can be used to rescore N−best lists of recognition hypotheses. The models capture some of the well known durational effects such as prepausal lengthening. They incorporate a simple back off mechanism to handle unseen words during rescoring. Experiments with various large vocabulary conversational speech recognition (LVCSR) evaluation sets showed consistent improvements of 0.7−1.0% in word error rate (WER).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Duration Modeling For Turkish Text-to-Speech Synthesis System

Naturalness of synthetic speech depends on appropriate modeling of prosodic aspects. Mostly, three prosody components are modeled: segmental duration, pitch contour and intensity. In this study, we present our work on modeling segmental duration in Turkish by using machine-learning algorithms. The models predict phone durations based on attributes such as phone identity, neighboring phone ident...

متن کامل

Modeling word durations

We describe a new method of modeling duration at word level. These duration models are easily trained from the acoustic training data and can be used to rescore N−best lists of recognition hypotheses. The models capture some of the well known durational effects such as prepausal lengthening. They incorporate a simple back off mechanism to handle unseen words during rescoring. Experiments with v...

متن کامل

Segmental duration modeling in Turkish

Naturalness of synthetic speech highly depends on appropriate modeling of prosodic aspects. Mostly, three prosody components are modeled: segmental duration, pitch contour and intensity. In this study, we present our work on modeling segmental duration in Turkish using machinelearning algorithms, especially Classification and Regression Trees (CART). The models predict phone durations based on ...

متن کامل

Word duration modeling for word graph rescoring in LVCSR

A well-known unfavorable property of HMMs in speech recognition is their inappropriate representation of phone and word durations. This paper describes an approach to resolve this limitation by integrating explicit word duration models into an HMM-based speech recognizer. Word durations are represented by log-normal densities using a back-off strategy that approximates durations of words that h...

متن کامل

Arabic Handwritten Word Recognition Using HMMs with Explicit State Duration

We describe an offline unconstrained Arabic handwritten word recognition system based on segmentation-free approach and discrete hidden Markov models (HMMs) with explicit state duration. Character durations play a significant part in the recognition of cursive handwriting. The duration information is still mostly disregarded in HMM-based automatic cursive handwriting recognizers due to the fact...

متن کامل

Synchronizing timelines: Relations between fixation durations and N400 amplitudes during sentence reading

We examined relations between eye movements (single-fixation durations) and RSVP-based event-related potentials (ERPs; N400s) recorded during reading the same sentences in two independent experiments. Longer fixation durations correlated with larger N400 amplitudes. Word frequency and predictability of the fixated word as well as the predictability of the upcoming word accounted for this covari...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2000